Acquisition of the Morphological Structure of the Lexicon Based on Lexical Similarity and Formal Analogy

Author

  • Nabil Hathout
Abstract

The paper presents a computational model that aims to make the morphological structure of the lexicon emerge from the formal and semantic regularities of the words it contains. The model is purely lexeme-based. The proposed morphological structure consists of (1) binary relations that connect each headword with the words that are morphologically related to it, especially the members of its morphological family and its derivational series, and (2) the analogies that hold between the words. The model has been tested on the lexicon of French using the TLFi machine-readable dictionary.

1 Lexeme-based morphology

Morphology is traditionally considered to be the field of linguistics that studies the structure of words. In this conception, words are made of morphemes which combine according to rules of inflexion, derivation and composition. Although the morpheme-based theoretical framework is both elegant and easy to implement, it suffers from many drawbacks pointed out by several authors (Anderson, 1992; Aronoff, 1994). The alternative theoretical models that have been proposed fall within lexeme-based or word-based morphology, in which the minimal units are words instead of morphemes. Words then do not have any structure at all, and morphology becomes a level of organization of the lexicon based on the sharing of semantic and formal properties.

© 2008. Licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported license (http://creativecommons.org/licenses/by-nc-sa/3.0/). Some rights reserved.

The morpheme-based / lexeme-based distinction shows up on the computational level. In the morpheme-based conception, the morphological analysis of a word aims at segmenting it into a sequence of morphemes (Déjean, 1998; Goldsmith, 2001; Creutz and Lagus, 2002; Bernhard, 2006). In a lexeme-based approach, the aim is to discover the relations between the word and the other lexical items. These relations serve to identify the morphological family of the word, its derivational series, and the analogies in which it is involved.

For instance, the analysis of the French word dérivation may be considered satisfactory if it connects dérivation with enough members of its family (dériver ‘derivate’, dérivationnel ‘derivational’, dérivable, dérive ‘drift’, dériveur ‘sailing dinghy’, etc.) and of its derivational series (formation ‘education’, séduction, variation, émission, etc.). Each of these relations is integrated into a large collection of analogies that characterizes it semantically and formally. For instance, the relation between dérivation and dérivable is part of a series of analogies which includes dérivation:dérivable::variation:variable, dérivation:dérivable::modification:modifiable, etc. Similarly, dérivation and variation participate in a series of analogies such as dérivation:variation::dériver:varier, dérivation:variation::dérivationnel:variationnel, dérivation:variation::dérivable:variable.

2 Computational modeling

The paper describes a computational model that aims to make the morphological derivational structure of the lexicon emerge from the semantic and formal regularities of the words it contains. A first experiment is currently underway on the lexicon of French using the TLFi machine-readable dictionary...
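
To make the notion of a formal analogy such as dérivation:dérivable::variation:variable more concrete, the following minimal Python sketch checks whether four words stand in a purely suffixal analogy. It is an illustration only and not the method used in the paper: the function names `suffix_analogy` and `formal_analogy` are hypothetical, and the check assumes that the two pairs differ by exactly the same suffix substitution (it does not handle prefixation, stem allomorphy, or semantic analogies).

```python
def suffix_analogy(a: str, b: str, c: str, d: str) -> bool:
    """Return True if a -> b and c -> d are the same suffix substitution.

    That is, there exist stems x, y and suffixes s1, s2 such that
    a = x + s1, b = x + s2, c = y + s1, d = y + s2.
    (Hypothetical helper, assumed for illustration.)
    """
    for i in range(len(a) + 1):
        stem, s1 = a[:i], a[i:]
        if not b.startswith(stem):
            break  # no longer prefix of a can be a prefix of b
        s2 = b[len(stem):]
        if c.endswith(s1) and d.endswith(s2):
            # Compare what remains of c and d once the suffixes are removed.
            if c[:len(c) - len(s1)] == d[:len(d) - len(s2)]:
                return True
    return False


def formal_analogy(a: str, b: str, c: str, d: str) -> bool:
    """Check a:b::c:d, also trying the equivalent form a:c::b:d.

    Swapping the two middle terms covers analogies whose first pair
    shares a suffix rather than a stem (e.g. dérivation:variation).
    """
    return suffix_analogy(a, b, c, d) or suffix_analogy(a, c, b, d)


# Analogies quoted in the text above.
print(formal_analogy("dérivation", "dérivable", "variation", "variable"))
print(formal_analogy("dérivation", "variation", "dériver", "varier"))
print(formal_analogy("dérivation", "variation", "dérivationnel", "variationnel"))
```

Under this deliberately narrow definition, an analogy such as dérivation:dérivable::modification:modifiable is not recognized, because the stems modific- and modifi- differ; capturing it would require a richer notion of analogy than simple suffix substitution.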

Similar resources

The Effect of Lexicon-based Debates on the Felicity of Lexical Equivalents in Translating Literary Texts by Iranian EFL Learners

This study was an attempt to investigate the effect of lexicon-based debates on the felicity of lexical equivalents in translating literary texts by Iranian EFL learners. To fulfill the purpose of this study, 59 university students, majoring in English Translation, were randomly assigned to the experimental and control groups from a total of 73 students based on their performance on a mock TOE...

On multiword lexical units and their role in maritime dictionaries

Multi-word lexical units are a typical feature of specialized dictionaries, in particular monolingual and bilingual maritime dictionaries. The paper studies the concept of the multi-word lexical unit and considers the similarities and differences of their selection and presentation in monolingual and bilingual maritime dictionaries. The work analyses such issues as the classification of multi-w...

Morphonette: a morphological network of French

This paper describes in detail the first version of Morphonette, a new French morphological resource and a new radically lexeme-based method of morphological analysis. This research is grounded in a paradigmatic conception of derivational morphology where the morphological structure is a structure of the entire lexicon and not one of the individual words it contains. The discovery of this stru...

Pars Morph: A Persian Morphological Analyzer

In this paper, the theoretical foundation, the implementation and the uses of Pars Morph, a Persian morphological analyzer, are introduced. Pars Morph is a rule-based Persian morphological analysis system, which analyzes the internal structure of words in Persian and determines the grammatical category and function of the word parts. Pars Morph being in link with a lexicon covering about 45...

On the Role of Derivational Processes in the Formation of Non-Taxonomic Classes of Lexical Units in Russian

The paper is focused on classes of lexical units which arise as a result of derivational processes – word formation and semantic transfers, acting either in isolation or together, on the basis of common semantic foundations that bind targets and sources of derivation. The lexical items which constitute the classes under study vary in their denotative characteristics and due to their categ...

Publication date: 2008